Files and Databases

The goal of this workshop is to manipulate files (read, parse, and write them) and to do the same with structured databases.

Exercise 1

Download the "All associations with added ontology annotations" file from the GWAS Catalog.

Describe the columns of the file (What information are we looking at? What is it used for? Why was it collected?)


In [9]:
import pandas as pd

DF = pd.read_csv('../data/alternative.tsv', sep='\t')
DF


Out[9]:
DATE ADDED TO CATALOG PUBMEDID FIRST AUTHOR DATE JOURNAL LINK STUDY DISEASE/TRAIT INITIAL SAMPLE SIZE REPLICATION SAMPLE SIZE ... P-VALUE PVALUE_MLOG P-VALUE (TEXT) OR or BETA 95% CI (TEXT) PLATFORM [SNPS PASSING QC] CNV MAPPED_TRAIT MAPPED_TRAIT_URI STUDY ACCESSION
0 2009-09-28 18403759 Ober C 2008-04-09 N Engl J Med www.ncbi.nlm.nih.gov/pubmed/18403759 Effect of variation in CHI3L1 on serum YKL-40 ... YKL-40 levels 632 Hutterite individuals 443 European ancestry cases, 491 European ance... ... 1e-13 13.000000 NaN 0.30 [NR] ng/ml decrease Affymetrix [290325] N YKL40 measurement http://www.ebi.ac.uk/efo/EFO_0004869 GCST000177
1 2008-06-16 18369459 Liu Y 2008-04-04 PLoS Genet www.ncbi.nlm.nih.gov/pubmed/18369459 A genome-wide association study of psoriasis a... Psoriasis 218 European ancestry cases, 519 European ance... 1,153 European ancestry cases, 1,217 European ... ... 2e-06 5.698970 NaN 1.41 [1.22-1.61] Illumina [305983] N psoriasis http://www.ebi.ac.uk/efo/EFO_0000676 GCST000173
2 2008-06-16 18385676 Amos CI 2008-04-03 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18385676 Genome-wide association scan of tag SNPs ident... Lung cancer 1,154 European ancestry cases, 1,137 European ... 2,724 European ancestry cases, 3,694 European ... ... 3e-18 17.522879 NaN 1.30 [1.15-1.47] Illumina [317498] N lung carcinoma http://www.ebi.ac.uk/efo/EFO_0001071 GCST000172
3 2008-06-16 18385676 Amos CI 2008-04-03 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18385676 Genome-wide association scan of tag SNPs ident... Lung cancer 1,154 European ancestry cases, 1,137 European ... 2,724 European ancestry cases, 3,694 European ... ... 7e-06 5.154902 NaN 1.22 [1.10-1.35] Illumina [317498] N lung carcinoma http://www.ebi.ac.uk/efo/EFO_0001071 GCST000172
4 2008-06-16 18385676 Amos CI 2008-04-03 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18385676 Genome-wide association scan of tag SNPs ident... Lung cancer 1,154 European ancestry cases, 1,137 European ... 2,724 European ancestry cases, 3,694 European ... ... 8e-06 5.096910 NaN 1.16 [1.05-1.28] Illumina [317498] N lung carcinoma http://www.ebi.ac.uk/efo/EFO_0001071 GCST000172
5 2008-06-16 18385738 Hung RJ 2008-04-03 Nature www.ncbi.nlm.nih.gov/pubmed/18385738 A susceptibility locus for lung cancer maps to... Lung cancer 1,926 European ance other ancestry cases, 2,52... 332 European ancestry cases, 462 European ance... ... 5e-20 19.301030 NaN 1.30 [1.23-1.37] Illumina [310023] N lung carcinoma http://www.ebi.ac.uk/efo/EFO_0001071 GCST000170
6 2008-09-16 18385739 Thorgeirsson TE 2008-04-03 Nature www.ncbi.nlm.nih.gov/pubmed/18385739 A variant associated with nicotine dependence,... Nicotine dependence 10,995 European ancestry individuals 4,848 European ancestry individuals ... 6e-20 19.221849 NaN 0.10 [0.08-0.12] cigarettes per day increase Illumina [306207] N nicotine dependence http://www.ebi.ac.uk/efo/EFO_0003768 GCST000171
7 2008-06-16 18372901 Tenesa A 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372901 Genome-wide association scan identifies a colo... Colorectal cancer 981 European ancestry cases, 1,002 European an... 10,287 European ancestry cases, 10,401 Europea... ... 9e-26 25.045757 NaN 1.19 [1.15-1.23] Illumina [541628] N colorectal cancer http://www.ebi.ac.uk/efo/EFO_0005842 GCST000168
8 2008-06-16 18372901 Tenesa A 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372901 Genome-wide association scan identifies a colo... Colorectal cancer 981 European ancestry cases, 1,002 European an... 10,287 European ancestry cases, 10,401 Europea... ... 6e-10 9.221849 NaN 1.11 [1.08-1.15] Illumina [541628] N colorectal cancer http://www.ebi.ac.uk/efo/EFO_0005842 GCST000168
9 2008-06-16 18372901 Tenesa A 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372901 Genome-wide association scan identifies a colo... Colorectal cancer 981 European ancestry cases, 1,002 European an... 10,287 European ancestry cases, 10,401 Europea... ... 8e-28 27.096910 NaN 1.20 [1.16-1.24] Illumina [541628] N colorectal cancer http://www.ebi.ac.uk/efo/EFO_0005842 GCST000168
10 2008-06-16 18372905 Tomlinson IP 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372905 A genome-wide association study identifies col... Colorectal cancer 922 European ancestry cases, 927 European ance... 17,089 European ancestry cases, 16,862 Europea... ... 3e-13 12.522879 NaN 1.12 [1.10-1.16] Illumina [547647] N colorectal cancer http://www.ebi.ac.uk/efo/EFO_0005842 GCST000169
11 2008-06-16 18372905 Tomlinson IP 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372905 A genome-wide association study identifies col... Colorectal cancer 922 European ancestry cases, 927 European ance... 17,089 European ancestry cases, 16,862 Europea... ... 3e-18 17.522879 NaN 1.27 [1.20-1.34] Illumina [547647] N colorectal cancer http://www.ebi.ac.uk/efo/EFO_0005842 GCST000169
12 2008-06-16 18372903 Zeggini E 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372903 Meta-analysis of genome-wide association data ... Type 2 diabetes 4,549 European ancestry cases, 5,579 European ... 24,194 European ancestry cases, 55,598 Europea... ... 5e-14 13.301030 NaN 1.10 [1.07-1.13] Affymetrix, Illumina [2202892] (imputed) N type II diabetes mellitus http://www.ebi.ac.uk/efo/EFO_0001360 GCST000167
13 2008-06-16 18372903 Zeggini E 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372903 Meta-analysis of genome-wide association data ... Type 2 diabetes 4,549 European ancestry cases, 5,579 European ... 24,194 European ancestry cases, 55,598 Europea... ... 1e-10 10.000000 NaN 1.11 [1.07-1.14] Affymetrix, Illumina [2202892] (imputed) N type II diabetes mellitus http://www.ebi.ac.uk/efo/EFO_0001360 GCST000167
14 2008-06-16 18372903 Zeggini E 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372903 Meta-analysis of genome-wide association data ... Type 2 diabetes 4,549 European ancestry cases, 5,579 European ... 24,194 European ancestry cases, 55,598 Europea... ... 1e-09 9.000000 NaN 1.09 [1.06-1.12] Affymetrix, Illumina [2202892] (imputed) N type II diabetes mellitus http://www.ebi.ac.uk/efo/EFO_0001360 GCST000167
15 2008-06-16 18372903 Zeggini E 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372903 Meta-analysis of genome-wide association data ... Type 2 diabetes 4,549 European ancestry cases, 5,579 European ... 24,194 European ancestry cases, 55,598 Europea... ... 1e-09 9.000000 NaN 1.15 [1.10-1.20] Affymetrix, Illumina [2202892] (imputed) N type II diabetes mellitus http://www.ebi.ac.uk/efo/EFO_0001360 GCST000167
16 2008-06-16 18372903 Zeggini E 2008-03-30 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18372903 Meta-analysis of genome-wide association data ... Type 2 diabetes 4,549 European ancestry cases, 5,579 European ... 24,194 European ancestry cases, 55,598 Europea... ... 1e-08 8.000000 NaN 1.09 [1.06-1.12] Affymetrix, Illumina [2202892] (imputed) N type II diabetes mellitus http://www.ebi.ac.uk/efo/EFO_0001360 GCST000167
17 2008-06-16 18326623 Gold B 2008-03-11 Proc Natl Acad Sci U S A www.ncbi.nlm.nih.gov/pubmed/18326623 Genome-wide association study provides evidenc... Breast cancer 249 Ashkenazi Jewish non-BRCA1/2 carriers case... 1,193 Ashkenazi Jewish non-BRCA1/2 carriers c... ... 3e-08 7.522879 NaN 1.41 [1.25-1.59] Affymetrix [150080] N breast carcinoma http://www.ebi.ac.uk/efo/EFO_0000305 GCST000162
18 2008-07-22 18332876 Kirov G 2008-03-11 Mol Psychiatry www.ncbi.nlm.nih.gov/pubmed/18332876 A genome-wide association study in 574 schizop... Schizophrenia 574 European ancestry trios, 605 European ance... NaN ... 1e-06 6.000000 NaN NaN NaN Illumina [~ 550000] N schizophrenia http://www.ebi.ac.uk/efo/EFO_0000692 GCST000163
19 2008-09-17 18327256 Doring A 2008-03-09 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18327256 SLC2A9 influences uric acid concentrations wit... Urate levels 1,644 European ancestry individuals 9,947 European ancestry individuals ... 3e-70 69.522879 NaN 0.35 [NR] mg/dl decrease in uric acid Affymetrix [335152] N urate measurement http://www.ebi.ac.uk/efo/EFO_0004531 GCST000161
20 2008-09-17 18327257 Vitart V 2008-03-09 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18327257 SLC2A9 is a newly identified urate transporter... Urate levels 794 European ancestry individuals 706 European ancestry individuals ... 3e-09 8.522879 NaN 0.88 [NR] uM decrease in uric acid [females only] Illumina [308140] N urate measurement http://www.ebi.ac.uk/efo/EFO_0004531 GCST000160
21 2008-06-16 18311140 Hunt KA 2008-03-02 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18311140 Newly identified genetic risk variants for cel... Celiac disease 767 European ancestry cases, 1,422 European an... 1,643 European ancestry cases, 3,406 European ... ... 3e-11 10.522879 NaN 1.39 [1.26-1.53] Illumina [310605] N celiac disease http://www.ebi.ac.uk/efo/EFO_0001060 GCST000157
22 2008-06-16 18311140 Hunt KA 2008-03-02 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18311140 Newly identified genetic risk variants for cel... Celiac disease 767 European ancestry cases, 1,422 European an... 1,643 European ancestry cases, 3,406 European ... ... 4e-09 8.397940 NaN 1.28 [1.18-1.39] Illumina [310605] N celiac disease http://www.ebi.ac.uk/efo/EFO_0001060 GCST000157
23 2008-06-16 18311140 Hunt KA 2008-03-02 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18311140 Newly identified genetic risk variants for cel... Celiac disease 767 European ancestry cases, 1,422 European an... 1,643 European ancestry cases, 3,406 European ... ... 1e-09 9.000000 NaN 1.35 [1.23-1.49] Illumina [310605] N celiac disease http://www.ebi.ac.uk/efo/EFO_0001060 GCST000157
24 2008-06-16 18311140 Hunt KA 2008-03-02 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18311140 Newly identified genetic risk variants for cel... Celiac disease 767 European ancestry cases, 1,422 European an... 1,643 European ancestry cases, 3,406 European ... ... 5e-09 8.301030 NaN 1.23 [1.15-1.32] Illumina [310605] N celiac disease http://www.ebi.ac.uk/efo/EFO_0001060 GCST000157
25 2008-06-16 18311140 Hunt KA 2008-03-02 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18311140 Newly identified genetic risk variants for cel... Celiac disease 767 European ancestry cases, 1,422 European an... 1,643 European ancestry cases, 3,406 European ... ... 7e-08 7.154902 NaN 1.21 [1.13-1.30] Illumina [310605] N celiac disease http://www.ebi.ac.uk/efo/EFO_0001060 GCST000157
26 2008-06-16 18264097 Eeles RA 2008-02-10 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18264097 Multiple newly identified loci associated with... Prostate cancer 1,854 European ancestry cases, 1,894 European ... 3,268 European ancestry cases, 3,366 European ... ... 9e-29 28.045757 NaN 1.25 [1.17-1.34] Illumina [541129] N prostate carcinoma http://www.ebi.ac.uk/efo/EFO_0001663 GCST000152
27 2008-06-16 18264097 Eeles RA 2008-02-10 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18264097 Multiple newly identified loci associated with... Prostate cancer 1,854 European ancestry cases, 1,894 European ... 3,268 European ancestry cases, 3,366 European ... ... 2e-18 17.698970 NaN 1.20 [1.10-1.33] Illumina [541129] N prostate carcinoma http://www.ebi.ac.uk/efo/EFO_0001663 GCST000152
28 2008-06-16 18264097 Eeles RA 2008-02-10 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18264097 Multiple newly identified loci associated with... Prostate cancer 1,854 European ancestry cases, 1,894 European ... 3,268 European ancestry cases, 3,366 European ... ... 2e-12 11.698970 NaN 1.19 [1.11-1.27] Illumina [541129] N prostate carcinoma http://www.ebi.ac.uk/efo/EFO_0001663 GCST000152
29 2008-06-16 18264097 Eeles RA 2008-02-10 Nat Genet www.ncbi.nlm.nih.gov/pubmed/18264097 Multiple newly identified loci associated with... Prostate cancer 1,854 European ancestry cases, 1,894 European ... 3,268 European ancestry cases, 3,366 European ... ... 6e-10 9.221849 NaN 1.17 [1.08-1.26] Illumina [541129] N prostate carcinoma http://www.ebi.ac.uk/efo/EFO_0001663 GCST000152
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
35299 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 1E-10 10.000000 NaN 0.38 [0.26-0.5] millisecond decrease Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35300 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 5E-19 18.301030 NaN 0.54 [0.42-0.66] millisecond decrease Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35301 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 1E-22 22.000000 NaN 0.60 [0.48-0.72] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35302 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 1E-7 7.000000 NaN 0.32 [0.2-0.44] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35303 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 4E-10 9.397940 NaN 0.40 [0.28-0.52] millisecond decrease Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35304 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 2E-9 8.698970 NaN 0.36 [0.24-0.48] millisecond decrease Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35305 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 4E-16 15.397940 NaN 0.53 [0.41-0.65] millisecond decrease Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35306 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 2E-6 5.698970 NaN 0.48 [0.28-0.68] millisecond decrease Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35307 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 7E-8 7.154902 NaN 0.48 [0.3-0.66] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35308 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 1E-10 10.000000 NaN 0.38 [0.26-0.5] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35309 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 5E-6 5.301030 NaN 0.80 [0.45-1.15] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35310 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 6E-29 28.221849 NaN 0.75 [0.61-0.89] millisecond decrease Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35311 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 8E-9 8.096910 NaN 0.39 [0.25-0.53] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35312 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 4E-14 13.397940 (African American) 0.94 [0.7-1.18] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35313 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 2E-13 12.698970 (European) 0.56 [0.4-0.72] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35314 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 3E-23 22.522879 (European) 0.44 [0.28-0.6] millisecond decrease Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35315 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 7E-6 5.154902 (African American) 0.86 [0.7-1.02] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35316 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 3E-7 6.522879 (European) 0.53 [0.51-0.55] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35317 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 5E-8 7.301030 NaN 0.47 [0.29-0.65] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35318 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 1E-6 6.000000 (European) 0.48 [0.46-0.5] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35319 2017-02-08 27577874 Evans DS 2016-08-29 Hum Mol Genet www.ncbi.nlm.nih.gov/pubmed/27577874 Fine-mapping, Novel Loci Identification, and S... QRS duration up to 13,031 African Americans individuals, 40... NaN ... 8E-9 8.096910 NaN 0.44 [0.28-0.6] millisecond increase Affymetrix, Illumina [2955816] (imputed) N QRS duration http://www.ebi.ac.uk/efo/EFO_0005055 GCST003598
35320 2017-02-10 27564568 Liu C 2016-08-26 Clin Pharmacol Ther www.ncbi.nlm.nih.gov/pubmed/27564568 A Genome-wide Approach Validates that Thiopuri... Thiopurine methyltransferase activity in acute... up to 407 European ancestry cases, up to 138 A... NaN ... 9E-61 60.045757 NaN 7.69 unit decrease Affymetrix, Illumina [9000000] (imputed) N thiopurine methyltransferase activity measurem... http://www.ebi.ac.uk/efo/EFO_0007852, http://w... GCST003609
35321 2017-02-13 27424934 Choi HJ 2016-07-15 Bone www.ncbi.nlm.nih.gov/pubmed/27424934 Genome-wide association study in East Asians s... Bone mineral density (total hip) 2,729 Korean ancestry individuals 1,547 Chinese Han ancestry individuals, 3,237 ... ... 3E-6 5.522879 NaN 0.14 unit decrease Affymetrix [328918] N hip bone mineral density http://www.ebi.ac.uk/efo/EFO_0007702 GCST003611
35322 2017-02-13 27424934 Choi HJ 2016-07-15 Bone www.ncbi.nlm.nih.gov/pubmed/27424934 Genome-wide association study in East Asians s... Bone mineral density (total hip) 2,729 Korean ancestry individuals 1,547 Chinese Han ancestry individuals, 3,237 ... ... 2E-7 6.698970 (females) 0.16 unit increase Affymetrix [328918] N hip bone mineral density http://www.ebi.ac.uk/efo/EFO_0007702 GCST003611
35323 2017-02-13 27424934 Choi HJ 2016-07-15 Bone www.ncbi.nlm.nih.gov/pubmed/27424934 Genome-wide association study in East Asians s... Bone mineral density (total hip) 2,729 Korean ancestry individuals 1,547 Chinese Han ancestry individuals, 3,237 ... ... 9E-9 8.045757 NaN 0.12 unit increase Affymetrix [328918] N hip bone mineral density http://www.ebi.ac.uk/efo/EFO_0007702 GCST003611
35324 2017-02-13 27424934 Choi HJ 2016-07-15 Bone www.ncbi.nlm.nih.gov/pubmed/27424934 Genome-wide association study in East Asians s... Bone mineral density (total hip) 2,729 Korean ancestry individuals 1,547 Chinese Han ancestry individuals, 3,237 ... ... 4E-8 7.397940 (females) 0.15 unit increase Affymetrix [328918] N hip bone mineral density http://www.ebi.ac.uk/efo/EFO_0007702 GCST003611
35325 2017-02-13 27424934 Choi HJ 2016-07-15 Bone www.ncbi.nlm.nih.gov/pubmed/27424934 Genome-wide association study in East Asians s... Bone mineral density (femoral neck) 2,729 Korean ancestry individuals 1,547 Han Chinese ancestry individuals, 3,237 ... ... 4E-7 6.397940 NaN 0.15 unit decrease Affymetrix [328918] N femoral neck bone mineral density http://www.ebi.ac.uk/efo/EFO_0007785 GCST003612
35326 2017-02-13 27424934 Choi HJ 2016-07-15 Bone www.ncbi.nlm.nih.gov/pubmed/27424934 Genome-wide association study in East Asians s... Bone mineral density (femoral neck) 2,729 Korean ancestry individuals 1,547 Han Chinese ancestry individuals, 3,237 ... ... 2E-7 6.698970 (females) 0.16 unit increase Affymetrix [328918] N femoral neck bone mineral density http://www.ebi.ac.uk/efo/EFO_0007785 GCST003612
35327 2017-02-13 27424934 Choi HJ 2016-07-15 Bone www.ncbi.nlm.nih.gov/pubmed/27424934 Genome-wide association study in East Asians s... Bone mineral density (femoral neck) 2,729 Korean ancestry individuals 1,547 Han Chinese ancestry individuals, 3,237 ... ... 1E-6 6.000000 NaN 0.10 unit increase Affymetrix [328918] N femoral neck bone mineral density http://www.ebi.ac.uk/efo/EFO_0007785 GCST003612
35328 2017-02-13 27424934 Choi HJ 2016-07-15 Bone www.ncbi.nlm.nih.gov/pubmed/27424934 Genome-wide association study in East Asians s... Bone mineral density (femoral neck) 2,729 Korean ancestry individuals 1,547 Han Chinese ancestry individuals, 3,237 ... ... 6E-7 6.221849 (females) 0.13 unit increase Affymetrix [328918] N femoral neck bone mineral density http://www.ebi.ac.uk/efo/EFO_0007785 GCST003612

35329 rows × 37 columns
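
One possible starting point for describing the columns is a per-column summary of type, missing values, and distinct values. The sketch below is only an illustration: it runs on a tiny in-memory table built from a few column names and values visible in the output above (the real file has 37 columns), and `summarize_columns` is a hypothetical helper, not part of pandas.

```python
import io
import pandas as pd

# Tiny stand-in for the GWAS Catalog file: a few columns and rows copied from
# the output above. The real file has 37 columns.
SAMPLE_TSV = (
    "PUBMEDID\tDISEASE/TRAIT\tP-VALUE\tP-VALUE (TEXT)\tSTUDY ACCESSION\n"
    "18403759\tYKL-40 levels\t1e-13\t\tGCST000177\n"
    "18369459\tPsoriasis\t2e-06\t\tGCST000173\n"
    "18385676\tLung cancer\t3e-18\t\tGCST000172\n"
    "18385676\tLung cancer\t7e-06\t\tGCST000172\n"
)

def summarize_columns(df):
    """Return one row per column: dtype, fraction missing, distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().mean(),
        "distinct": df.nunique(),
    })

sample = pd.read_csv(io.StringIO(SAMPLE_TSV), sep="\t")
summary = summarize_columns(sample)
print(summary)
```

Applied to the full `DF`, the same helper would show which columns are mostly empty and which ones repeat across rows; repeated values are a hint that a column belongs in its own table.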

Which entities (tables) can you define?

- Intermediate entities
- Entity-relationship models
- Foreign keys (the lines that connect entities)
- How to insert data into MySQL from Python
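
To make these ideas concrete, here is a minimal sketch of two entities joined by an intermediate entity through foreign keys. It rests on several assumptions: it uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server, and every table and column name is a hypothetical choice, not the expected answer. The accession `GCST999999` is deliberately one that was never inserted.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE study (
    accession TEXT PRIMARY KEY,      -- STUDY ACCESSION
    pubmed_id INTEGER NOT NULL,      -- PUBMEDID
    first_author TEXT                -- FIRST AUTHOR
);
CREATE TABLE trait (
    uri TEXT PRIMARY KEY,            -- MAPPED_TRAIT_URI
    name TEXT NOT NULL               -- MAPPED_TRAIT
);
-- Intermediate entity: each association links one study to one trait.
CREATE TABLE association (
    id INTEGER PRIMARY KEY,
    study_accession TEXT NOT NULL REFERENCES study(accession),
    trait_uri TEXT NOT NULL REFERENCES trait(uri),
    p_value REAL                     -- P-VALUE
);
"""

INSERT_ASSOCIATION = (
    "INSERT INTO association (study_accession, trait_uri, p_value) "
    "VALUES (?, ?, ?)"
)
PSORIASIS_URI = "http://www.ebi.ac.uk/efo/EFO_0000676"

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript(SCHEMA)

conn.execute("INSERT INTO study VALUES ('GCST000173', 18369459, 'Liu Y')")
conn.execute("INSERT INTO trait VALUES (?, 'psoriasis')", (PSORIASIS_URI,))
conn.execute(INSERT_ASSOCIATION, ("GCST000173", PSORIASIS_URI, 2e-06))

# A row pointing at a study that was never inserted violates the foreign key.
try:
    conn.execute(INSERT_ASSOCIATION, ("GCST999999", PSORIASIS_URI, 1e-08))
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print("foreign key enforced:", fk_enforced)
```

The `CREATE TABLE` statements carry over to MySQL with minor changes, but the connection setup and the placeholder style depend on the MySQL driver you choose.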


In [ ]:
DF['PVALUE_MLOG'].plot()  # quick look at -log10(p-value) for each association

Create the database (copy the SQL code that was used).


In [ ]:

Exercise 2

Read the file and store the information in the database, in the tables defined in Exercise 1.
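
One way to approach this, sketched under clearly stated assumptions: the block below parses a three-row in-memory sample (values copied from the output in Exercise 1) instead of the downloaded file, uses the built-in `sqlite3` module as a stand-in for MySQL, and loads two hypothetical tables, `study` and `association`. Your own tables from Exercise 1 will likely differ.

```python
import csv
import io
import sqlite3

# Stand-in for the downloaded TSV: three rows copied from the output above.
SAMPLE_TSV = (
    "PUBMEDID\tDISEASE/TRAIT\tP-VALUE\tSTUDY ACCESSION\n"
    "18369459\tPsoriasis\t2e-06\tGCST000173\n"
    "18385676\tLung cancer\t3e-18\tGCST000172\n"
    "18385676\tLung cancer\t7e-06\tGCST000172\n"
)

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE study (accession TEXT PRIMARY KEY, pubmed_id INTEGER, trait TEXT);
CREATE TABLE association (
    id INTEGER PRIMARY KEY,
    study_accession TEXT REFERENCES study(accession),
    p_value REAL
);
""")

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))

# A study appears once per association in the file; OR IGNORE skips repeats.
conn.executemany(
    "INSERT OR IGNORE INTO study VALUES (?, ?, ?)",
    [(r["STUDY ACCESSION"], int(r["PUBMEDID"]), r["DISEASE/TRAIT"]) for r in rows],
)
conn.executemany(
    "INSERT INTO association (study_accession, p_value) VALUES (?, ?)",
    [(r["STUDY ACCESSION"], float(r["P-VALUE"])) for r in rows],
)
conn.commit()

n_studies = conn.execute("SELECT COUNT(*) FROM study").fetchone()[0]
n_assoc = conn.execute("SELECT COUNT(*) FROM association").fetchone()[0]
print(n_studies, "studies,", n_assoc, "associations")
```

Passing values through `?` placeholders, instead of building SQL strings by hand, lets the driver handle quoting for trait names that contain commas or apostrophes.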


In [ ]:

Exercise 3

Run a query against the database that answers a biological question (e.g., which genes are associated with which diseases).
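
As an illustration of the kind of query meant here, the sketch below joins a gene table and a trait table through a linking table. Everything in it is an assumption for demonstration: the schema is hypothetical, `sqlite3` stands in for MySQL, and `GENE_A` and `GENE_B` are placeholder symbols, not real findings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and placeholder data, for illustration only.
conn.executescript("""
CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT UNIQUE);
CREATE TABLE trait (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE gene_trait (
    gene_id INTEGER REFERENCES gene(id),
    trait_id INTEGER REFERENCES trait(id),
    p_value REAL
);
INSERT INTO gene VALUES (1, 'GENE_A'), (2, 'GENE_B');
INSERT INTO trait VALUES (1, 'Lung cancer'), (2, 'Psoriasis');
INSERT INTO gene_trait VALUES (1, 1, 3e-18), (2, 1, 7e-06), (2, 2, 2e-06);
""")

# For every gene/trait pair, report the strongest (smallest) p-value.
QUERY = """
SELECT g.symbol, t.name, MIN(gt.p_value) AS best_p
FROM gene_trait AS gt
JOIN gene AS g ON g.id = gt.gene_id
JOIN trait AS t ON t.id = gt.trait_id
GROUP BY g.symbol, t.name
ORDER BY g.symbol, t.name
"""

pairs = conn.execute(QUERY).fetchall()
for symbol, name, best_p in pairs:
    print(symbol, name, best_p)
```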


In [ ]:

Exercise 4

Save the result of the previous query to a CSV file.
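
A minimal sketch of one way to do this with the standard `csv` module. It assumes a small hypothetical `association` table in an in-memory SQLite database, and it writes to an in-memory buffer so that it runs anywhere; `write_csv` is a hypothetical helper and `result.csv` is only an example file name.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the kind of rows a query might return.
conn.executescript("""
CREATE TABLE association (trait TEXT, p_value REAL);
INSERT INTO association VALUES ('Psoriasis', 2e-06), ('Lung cancer', 3e-18);
""")

def write_csv(cursor, handle):
    """Write a header row from the cursor metadata, then every result row."""
    writer = csv.writer(handle)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)

cursor = conn.execute("SELECT trait, p_value FROM association ORDER BY trait")

buffer = io.StringIO()  # stand-in for open("result.csv", "w", newline="")
write_csv(cursor, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Taking the header from `cursor.description` keeps the column names in the file in sync with the `SELECT` list, so the helper does not need to change when the query does.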


In [ ]: